Rethinking Attribute Representation and Injection for Sentiment Classification
Text attributes, such as user and product information in product reviews,
have been used to improve the performance of sentiment classification models.
The de facto standard method is to incorporate them as additional biases in the
attention mechanism, and more performance gains are achieved by extending the
model architecture. In this paper, we show that the above method is the least
effective way to represent and inject attributes. To demonstrate this
hypothesis, unlike previous models with complicated architectures, we limit our
base model to a simple BiLSTM with attention classifier, and instead focus on
how and where the attributes should be incorporated in the model. We propose to
represent attributes as chunk-wise importance weight matrices and consider four
locations in the model (i.e., embedding, encoding, attention, classifier) to
inject attributes. Experiments show that our proposed method achieves
significant improvements over the standard approach and that the attention
mechanism is the worst location to inject attributes, contradicting prior work.
We also outperform the state-of-the-art despite our use of a simple base model.
Finally, we show that these representations transfer well to other tasks. Model
implementation and datasets are released here:
https://github.com/rktamplayo/CHIM.
Comment: EMNLP 201
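
The abstract describes attributes as chunk-wise importance weight matrices but gives no formula. The sketch below is one possible reading, not the authors' implementation: an attribute embedding is projected to a small grid of scores, each score is repeated over a block (chunk) of a layer's weight matrix, and the weights are rescaled element-wise. The sigmoid squashing, the projection, and every name here (`chunk_scores`, `expand_to_weight_shape`, `inject`) are assumptions made for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def chunk_scores(attr_embedding, projection, n_row_chunks, n_col_chunks):
    """Project an attribute embedding to a grid of importance scores in (0, 1).

    `projection` is a hypothetical learned matrix with one row per chunk.
    """
    flat = [
        sigmoid(sum(w * a for w, a in zip(row, attr_embedding)))
        for row in projection
    ]
    if len(flat) != n_row_chunks * n_col_chunks:
        raise ValueError("projection must have one row per chunk")
    return [
        flat[r * n_col_chunks:(r + 1) * n_col_chunks]
        for r in range(n_row_chunks)
    ]


def expand_to_weight_shape(scores, rows, cols):
    """Repeat each chunk score over its block so the grid matches (rows x cols)."""
    n_r, n_c = len(scores), len(scores[0])
    if rows % n_r or cols % n_c:
        raise ValueError("weight shape must be divisible by the chunk grid")
    block_r, block_c = rows // n_r, cols // n_c
    return [
        [scores[i // block_r][j // block_c] for j in range(cols)]
        for i in range(rows)
    ]


def inject(weight, importance):
    """Rescale a weight matrix element-wise by the expanded importance matrix."""
    return [
        [w * s for w, s in zip(w_row, s_row)]
        for w_row, s_row in zip(weight, importance)
    ]


# Toy usage: a 2-dimensional user embedding modulating a 4x4 weight matrix
# through a 2x2 chunk grid (all values are made up).
user_embedding = [0.5, -1.0]
projection = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 1.0]]
scores = chunk_scores(user_embedding, projection, 2, 2)
weight = [[1.0] * 4 for _ in range(4)]
scaled = inject(weight, expand_to_weight_shape(scores, 4, 4))
```

Under this reading, the same rescaling could be applied to the weights at any of the four locations the abstract lists (embedding, encoding, attention, classifier), since only the shape of the targeted weight matrix changes.
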
Unsupervised Opinion Summarization with Noising and Denoising
The supervised training of high-capacity models on large datasets containing
hundreds of thousands of document-summary pairs is critical to the recent
success of deep learning techniques for abstractive summarization.
Unfortunately, in most domains (other than news) such training data is not
available and cannot be easily sourced. In this paper we enable the use of
supervised learning for the setting where there are only documents available
(e.g., product or business reviews) without ground truth summaries. We create a
synthetic dataset from a corpus of user reviews by sampling a review,
pretending it is a summary, and generating noisy versions thereof which we
treat as pseudo-review input. We introduce several linguistically motivated
noise generation functions and a summarization model which learns to denoise
the input and generate the original review. At test time, the model accepts
genuine reviews and generates a summary containing salient opinions, treating
those that do not reach consensus as noise. Extensive automatic and human
evaluation shows that our model brings substantial improvements over both
abstractive and extractive baselines.
Comment: ACL 202
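
The synthetic-data recipe in this abstract (sample a review, treat it as the summary, generate noisy versions as pseudo-review input) can be sketched roughly as follows. The abstract does not specify its noise functions, so the two used here, random token dropping and adjacent-token swapping, are generic stand-ins and not the paper's linguistically motivated functions. All function names and parameters are hypothetical.

```python
import random


def drop_tokens(tokens, rng, p=0.1):
    """Stand-in noise: delete each token with probability p."""
    kept = [t for t in tokens if rng.random() >= p]
    return kept or tokens[:1]  # avoid returning an empty review


def swap_adjacent(tokens, rng, p=0.1):
    """Stand-in noise: swap neighbouring tokens with probability p."""
    out = list(tokens)
    i = 0
    while i < len(out) - 1:
        if rng.random() < p:
            out[i], out[i + 1] = out[i + 1], out[i]
            i += 2  # skip past the swapped pair
        else:
            i += 1
    return out


def make_training_pair(reviews, rng, n_noisy=3):
    """Sample one review as the pseudo-summary and noise it into pseudo-reviews.

    Returns (noisy_inputs, target); a denoising model would be trained to
    reconstruct `target` from `noisy_inputs`.
    """
    target = rng.choice(reviews)
    tokens = target.split()
    noisy_inputs = [
        " ".join(swap_adjacent(drop_tokens(tokens, rng), rng))
        for _ in range(n_noisy)
    ]
    return noisy_inputs, target


# Toy usage with made-up reviews and a fixed seed for reproducibility.
reviews = [
    "the battery lasts all day and the screen is bright",
    "shipping was slow but the product works as described",
]
rng = random.Random(0)
noisy_inputs, target = make_training_pair(reviews, rng)
```

Whitespace tokenization and the noise probabilities are placeholders; the point is only the shape of the data, several corrupted inputs paired with one clean target.
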